
Add end-to-end test report generation (and minor refactoring) #194

Merged
merged 5 commits into dividat:main on Nov 28, 2024

Conversation

yfyf
Collaborator

@yfyf yfyf commented Oct 23, 2024

My attempt to produce a nice e2e test summary / report has turned into a ridiculous yak-shaving mission.

Some background:

  • Spent hours Googling for simple CLI tools for language-agnostic automated testing of binaries, generating test reports, test report formats supported by GH, etc.
  • At first I was hoping to piggyback on BATS and its JUnit output + this GH Test Reporter. Issues: it's buggy (produces broken XML output), and there's no option to suppress log output and just print a summary to the console after running the tests. Almost works, but not quite.
  • I don't even want to get started on TAP and the tooling. What a mess.
  • Tried generating a CTRF report, but the generated GH summary is kinda ridiculous and there's no tooling for printing a summary to console.

So... decided to roll manually 🤷

I am not very pleased with the outcome, but it's still better than nothing. If you have any ideas on how to do this in a neater way I am all ears.

Some points to note:

  • passthru.tests (i.e. nix-build -A test <..> default.nix) never fails now; instead the exit status is stored in $out/status (see build). I guess this could be avoided by using --keep-failed instead and getting the test run artifacts from the temporary build dir?
  • Normal and interactive e2e test versions are stored in ./result with naming based on their source filepath, i.e.:
    tree -l ./result/tests/ | sed 's/->.*$//'
    ./result/tests/
    ├── interactive
    │   ├── application
    │   │   └── kiosk-persistence
    │   └── base
    │       └── proxy-and-update
    └── tests
        ├── application
        │   └── kiosk-persistence
        └── base
            └── proxy-and-update
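Since the build itself never fails, a consumer has to read the outcome back out of the store path. A minimal sketch of what that could look like (`check_test_status` is a hypothetical helper, not code from this PR; only the `$out/status` convention is from the description above):

```shell
#!/usr/bin/env sh
# Hypothetical helper: read the stored exit status of a built test
# output and report it. Assumes the $out/status convention described
# above, i.e. the directory contains a `status` file with the exit code.

check_test_status() {
    # $1: path to a built test output containing a `status` file
    status=$(cat "$1/status")
    if [ "$status" -eq 0 ]; then
        echo "OK: $1"
    else
        echo "FAIL: $1 (exit status $status)"
    fi
    return "$status"
}
```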
    

Example of a test summary printed to the console:

    ## End-to-end test result summary:
    - base/proxy-and-update: FAIL ✗ (duration: 0:02:42)
    - application/kiosk-persistence: OK ✓ (duration: 0:01:12)

    Ran 2 tests, passed: 1, failed: 1

Checklist

  • Changelog updated
  • Code documented
  • User manual updated

@yfyf
Collaborator Author

yfyf commented Oct 23, 2024

GH test summary can be seen here:
https://github.com/yfyf/playos/actions/runs/11482660452#summary-31956262032

For some reason GitHub renders the logs in <pre> with extra empty lines; not sure why, will look into it.

Edit: it seems whitespace handling is kinda broken on the summary page, though the markdown gets rendered differently if I paste it into a Gist, for example. 🤦

@yfyf yfyf requested a review from knuton October 23, 2024 15:29
@knuton
Member

knuton commented Oct 23, 2024

So... decided to roll manually 🤷

Right. I had a go at TAP/bats for something else some time ago, and walked away from it. It always feels like it should be good, though. Maybe something hasn't clicked yet.

Help me understand the implications for our codebase. If you had done it via bats/TAP, would we have used a library that would have generated/helped generate the right output? And then the tool would have transformed that into some summary? Or would we have still generated the right format of output ourselves, and only applied the tool to create the summary? Maybe the only extra tax is creating that markdown report manually?

Output in your failed test looks like a good thing! On main right now, it's really hard to see what failed at a glance.

@yfyf
Collaborator Author

yfyf commented Oct 23, 2024

Help me understand the implications for our codebase. If you had done it via bats/TAP, would we have used a library that would have generated/helped generate the right output? And then the tool would have transformed that into some summary?

The BATS approach was:

  • Wrap each e2e test (driver) $fooTest into a shell script ${fooTest}.bats containing @test "$fooTest" { $fooTest }
  • Link farm all the tests into some $testDir
  • Run the tests with bats --timing --recursive --report-formatter junit -o test-report.xml $testDir
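The wrapping step above could be sketched roughly like this (`make_bats_wrapper` is an illustrative name, not actual code from this PR; the bats flags are the ones quoted above):

```shell
#!/usr/bin/env sh
# Illustrative sketch of the abandoned BATS approach: wrap each e2e test
# driver in a .bats file so bats can run the whole directory.
# `make_bats_wrapper` is a hypothetical helper.

make_bats_wrapper() {
    # $1: path to the test driver, $2: directory to write the wrapper into
    name=$(basename "$1")
    cat > "$2/${name}.bats" <<EOF
@test "$name" {
    $1
}
EOF
}

# Then run everything with the flags mentioned above:
#   bats --timing --recursive --report-formatter junit -o test-report.xml $testDir
```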

Issues with this:

  1. No console output while tests are running, which sucks when running e2e tests because they take a while, might be hanging, etc. No flags to change this.
  2. The generated test-report.xml is broken because BATS doesn't escape the log output (could be fixed by piping the output in the wrapper files).
  3. If any test fails, it will dump all of the logs, no option to suppress this.
  4. Final summary printed contains only total counts.

Points 3 and 4 together mean that if anything fails, there's no good way to see an overview in the console, because individual test outcomes are interspersed with the failing test logs.

If it were only for the GH summary report, I would say that yeah, BATS + post-processing the XML is probably a better option. But now it just seems like a big indirection with questionable benefits.

@yfyf
Collaborator Author

yfyf commented Oct 23, 2024

Log formatting is now properly preserved in the summary page:
https://github.com/dividat/playos/actions/runs/11484431866?pr=194#summary-31962190652

@knuton knuton added the reviewable Ready for initial or iterative review label Nov 5, 2024
@knuton
Member

knuton commented Nov 28, 2024

Rebased with main.

Member

@knuton knuton left a comment


I think this is good to merge? I rebased and dropped the sample error commit.

passthru.tests (i.e. nix-build -A test <..> default.nix) never fails now; instead the exit status is stored in $out/status (see build).

This seems relatively unproblematic for our current needs. We do run them in CI. I guess it might surprise any consumers of the top-level package, but to our knowledge the only "external" consumer is also us.

@yfyf
Collaborator Author

yfyf commented Nov 28, 2024

I think this is good to merge?

Yes.

I am just thinking that it would also be nice if the CLI summary report were re-used for the integration tests as well. We have 9 of them now and the current fail-on-first-error for loop in the ./bin/test implementation is not ideal for local development. A summary table on GitHub would be helpful too. This could be a separate PR though.
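A minimal sketch of what I mean for ./bin/test (the function name is made up): run every test, print a per-test line, and only summarize at the end instead of aborting on the first failure.

```shell
#!/usr/bin/env sh
# Hypothetical run-everything loop for ./bin/test: execute all given
# test commands, report each outcome, and print a final summary.
# `run_all_tests` is an illustrative name, not code from this PR.

run_all_tests() {
    total=0 failed=0
    for t in "$@"; do
        total=$((total + 1))
        if "$t" >/dev/null 2>&1; then
            echo "- $t: OK"
        else
            echo "- $t: FAIL"
            failed=$((failed + 1))
        fi
    done
    echo "Ran $total tests, passed: $((total - failed)), failed: $failed"
    return "$failed"
}
```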

@knuton
Member

knuton commented Nov 28, 2024

I am just thinking that it would be also nice if the CLI summary report was re-used for integration tests as well. We have 9 of them now and the current for-loop fail-on-first-err in ./bin/test implementation is not ideal for local development. A summary table on Github would be helpful too.

👍

This could be a separate PR though.

👍

@knuton knuton marked this pull request as ready for review November 28, 2024 14:36
@knuton knuton merged commit d7a33e9 into dividat:main Nov 28, 2024
14 checks passed
@knuton knuton removed the reviewable Ready for initial or iterative review label Nov 28, 2024